SemanticScuttle - klotz.me » Tags: machine learning+data engineering

Tags: machine learning* + data engineering*

Google Launched LangExtract, a Python Library for Structured Data Extraction from Unstructured Text

Google has introduced LangExtract, an open-source Python library designed to help developers extract structured information from unstructured text using large language models such as the Gemini models. The library simplifies the process of converting free-form text into structured data, offering features like controlled generation, text chunking, parallel processing, and integration with various LLMs.

2025-08-09 Tags: machine learning, data engineering, python, google, langextract, llm, gemini, information extraction, e by klotz
Effortless Spreadsheet Normalisation With LLM

This article describes a workflow using Large Language Models (LLMs) to automate the process of normalising spreadsheet data, making it tidy and machine-readable for easier analysis and insights.

2025-03-15 Tags: data cleaning, data engineering, llm, machine learning, spreadsheet by klotz
Explainable Generic ML Pipeline with MLflow

An article detailing how to build a flexible, explainable, and algorithm-agnostic ML pipeline with MLflow, focusing on preprocessing, model training, and SHAP-based explanations.

2024-11-27 Tags: mlops, pipeline, mlflow, shap, xai, data engineering, feature engineering, machine learning, eda by klotz
Automating Data Pipelines with Python & GitHub Actions

An article by Shaw Talebi on a simple, free way to automate data workflows using Python and GitHub Actions.

2024-06-01 Tags: pipeline, python, github actions, machine learning, screwdriver, data engineering by klotz
Accelerating ML Application Development: Production-ready Airflow Integrations with Critical AI Tools

- Standardization, governance, simplified troubleshooting, and reusability in ML application development.
- Integrations with vector databases and LLM providers to support new applications; provides tutorials on integrating …

2024-05-11 Tags: openai, cohere, weaviate, pgvector, opensearch, apache, airflow, llm, data engineering, machine learning by klotz
Machine Learning on GCP: From Notebooks to Pipelines

Notebooks are not enough for ML at scale

2024-05-11 Tags: data science, data engineering, gcp, machine learning, vertex by klotz
Building an Open Source ML Pipeline: Part 1 | by Bennett Lambert | Apr, 2022 | Towards Data Science

2022-04-11 Tags: data engineering, pipeline, machine learning, minio, python, kubernetes, production engineering by klotz
The modern data pattern. Replayable data processing and ingestion… | by Luca Bigon | Jan, 2022 | Towards Data Science

2022-01-31 Tags: lambda architecture, repeatability, snowflake, dbt, pulumi, data engineering, machine learning, data warehouse by klotz
Tecton Feature Store

2021-11-16 Tags: tecton, machine learning, data engineering, feature store by klotz
7 Best Python Packages Kagglers Are Using Without Telling You | Towards Data Science

2021-08-09 Tags: data science, data engineering, exploratory data analysis, shap, explainability, machine learning, jupyter, python, training, ensemble by klotz
